Read community sequences

There are two OTU tables, which I named arbitrarily:

  1. communities_abundance: raw data from Nanxi
  2. communities_abundance_syl: curated community ESV table from Sylvie

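I formatted the data.frame so that both of them have the following variables:

  - SampleID: experiment ID used by Nanxi or Sylvie.
  - Community: the community, for instance C2R4 or C1Rpool.
  - Transfer: the transfer at which the community was sequenced.
  - Abundance: the count of this sequence in the community.
  - RelativeAbundance
  - CommunityESVID: the sequence identifier, specific to each community-sequence pair.
  - ESV: the DNA sequence of the 16S V4 region.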
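As an illustration of the shared layout, the sketch below builds a toy table and derives RelativeAbundance from the raw counts. It assumes that RelativeAbundance is the count of a sequence divided by the total count within a community at a given transfer; the notebook does not state this definition, and the sample IDs, ESV identifiers, and counts are made up.

```r
# Toy table with the shared columns; all values are hypothetical.
communities_abundance_toy <- data.frame(
  SampleID       = c("S1", "S1", "S1", "S2", "S2"),
  Community      = c("C2R4", "C2R4", "C2R4", "C1Rpool", "C1Rpool"),
  Transfer       = c(12, 12, 12, 12, 12),
  CommunityESVID = c("C2R4.1", "C2R4.2", "C2R4.3", "C1Rpool.1", "C1Rpool.2"),
  Abundance      = c(50, 30, 20, 90, 10),
  stringsAsFactors = FALSE
)

# Assumed definition: counts divided by the per-community, per-transfer total.
communities_abundance_toy$RelativeAbundance <- with(
  communities_abundance_toy,
  Abundance / ave(Abundance, Community, Transfer, FUN = sum)
)

print(communities_abundance_toy)
```

With this definition the relative abundances within each community sum to one, which is a quick sanity check on any reformatted table.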

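In communities_abundance_syl, there are additional variables:

  - CarbonSource: the carbon source used for assembly.
  - Order, Family, and Genus: taxonomy assigned by SILVA.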

communities_abundance_syl

Community richness along transfers based on communities_abundance
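A minimal sketch of how richness per community and transfer could be tabulated. It assumes that richness means the number of ESVs with a nonzero count; the sourced script may define it differently, and the toy values below are hypothetical.

```r
# Toy abundance table; communities, ESV IDs, and counts are hypothetical.
abundance_toy <- data.frame(
  Community      = c("C1R1", "C1R1", "C1R1", "C1R1", "C1R1", "C2R4", "C2R4"),
  Transfer       = c(1, 1, 1, 2, 2, 1, 1),
  CommunityESVID = c("a", "b", "c", "a", "b", "d", "e"),
  Abundance      = c(10, 5, 1, 15, 0, 7, 3)
)

# Assumed definition: richness = number of ESVs with Abundance > 0.
richness_toy <- aggregate(
  Abundance ~ Community + Transfer,
  data = abundance_toy,
  FUN = function(x) sum(x > 0)
)
names(richness_toy)[names(richness_toy) == "Abundance"] <- "Richness"

print(richness_toy)
```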

Match community ESVs and isolate Sanger sequences

Save the alignment result

IUPAC notation for DNA bases

  - W, S, M, K, R, Y each represent two possible bases at one position.
  - B, D, H, V each represent three possible bases.
  - N means any nucleotide (but not a gap).
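The ambiguity codes can be written out as a lookup table. The sketch below is plain R and self-contained; Biostrings ships the same mapping as `IUPAC_CODE_MAP`. The helper `iupac_compatible()` is a hypothetical name introduced here for illustration: it treats two codes as compatible when the sets of bases they stand for overlap.

```r
# IUPAC nucleotide codes and the bases each one can stand for.
iupac <- c(
  A = "A", C = "C", G = "G", T = "T",
  M = "AC", R = "AG", W = "AT", S = "CG", Y = "CT", K = "GT",
  V = "ACG", H = "ACT", D = "AGT", B = "CGT",
  N = "ACGT"
)

# Hypothetical helper: TRUE when two codes share at least one base.
iupac_compatible <- function(x, y) {
  bases_x <- strsplit(iupac[[x]], "")[[1]]
  bases_y <- strsplit(iupac[[y]], "")[[1]]
  length(intersect(bases_x, bases_y)) > 0
}

iupac_compatible("R", "A")  # R is A or G, so it is compatible with A
iupac_compatible("R", "C")  # R cannot be C
```

Whether an ambiguous base in a Sanger read counts as a mismatch against an ESV depends on the scoring used during alignment, so this lookup only describes the notation, not the matching rule used in the notebook.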

I tested five alignment methods in Biostrings::pairwiseAlignment()

[1] "global"       "local"        "overlap"      "global-local" "local-global"
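The five types differ in which ends of the two sequences must be included in the alignment: "global" aligns both sequences end to end, "local" aligns the best-matching substrings, "overlap" leaves end gaps unpenalized, and "global-local" and "local-global" align one whole sequence against a substring of the other. The sketch below is not Biostrings code. It is a simplified stand-in in base R, with a linear gap penalty and made-up scores, that contrasts only the global and local cases on a hypothetical pair in which one sequence is a fragment of the other.

```r
# Simplified dynamic-programming alignment score (linear gap penalty).
# This is an illustration only, not the scoring used by pairwiseAlignment().
align_score <- function(pattern, subject, type = c("global", "local"),
                        match_score = 1, mismatch_score = -1, gap_score = -2) {
  type <- match.arg(type)
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  n <- length(p)
  m <- length(s)
  H <- matrix(0, nrow = n + 1, ncol = m + 1)
  if (type == "global") {
    # Global alignment pays for leading gaps.
    H[, 1] <- gap_score * (0:n)
    H[1, ] <- gap_score * (0:m)
  }
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      diagonal <- H[i, j] + if (p[i] == s[j]) match_score else mismatch_score
      best <- max(diagonal, H[i, j + 1] + gap_score, H[i + 1, j] + gap_score)
      # Local alignment may restart anywhere, so scores never drop below zero.
      H[i + 1, j + 1] <- if (type == "local") max(0, best) else best
    }
  }
  if (type == "local") max(H) else H[n + 1, m + 1]
}

# Hypothetical sequences: the pattern is an exact fragment of the subject.
align_score("ACGT", "TTACGTTT", type = "local")
align_score("ACGT", "TTACGTTT", type = "global")
```

The local score rewards the perfect internal match, while the global score is dragged down by the unmatched flanks. This is the kind of difference that makes the choice of type matter when sequences of unequal length are compared.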

Number of isolates matched to a community ESV when up to two mismatches are allowed.

Up to one mismatch

No mismatch
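The counting logic behind these tables can be sketched in base R as follows: keep alignments within the mismatch limit, keep the best-scoring ESV for each isolate, then count each ESV only once per community. The function name `count_matched()`, the column names, and the toy alignment table are assumptions made for this example, not the notebook's actual objects.

```r
# Toy alignment table; isolates, ESVs, and scores are hypothetical.
alignment_toy <- data.frame(
  AlignmentType    = "local",
  Community        = "C1R1",
  ExpID            = c("i1", "i1", "i2", "i2", "i3"),
  CommunityESVID   = c("esv1", "esv2", "esv1", "esv2", "esv3"),
  BasePairMismatch = c(0, 2, 1, 2, 2),
  AlignmentScore   = c(500, 480, 495, 470, 490),
  stringsAsFactors = FALSE
)

count_matched <- function(alignment, allow_mismatch = 2) {
  # Keep only alignments within the allowed number of mismatches.
  kept <- alignment[alignment$BasePairMismatch <= allow_mismatch, ]
  # For each isolate, keep the ESV with the highest alignment score.
  kept <- kept[order(-kept$AlignmentScore), ]
  best <- kept[!duplicated(kept[c("AlignmentType", "ExpID")]), ]
  # Count an ESV once even if several isolates match it.
  best <- best[!duplicated(best[c("AlignmentType", "Community", "CommunityESVID")]), ]
  counts <- table(best$AlignmentType)
  setNames(as.integer(counts), names(counts))
}

count_matched(alignment_toy, allow_mismatch = 2)
count_matched(alignment_toy, allow_mismatch = 0)
```

In the toy table two isolates share the same best ESV, so they contribute a single match; relaxing the mismatch limit brings in a third isolate that matches a different ESV.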

Relative abundance explained by the sequences

Abundances explained by ESVs

Abundances explained by ESVs in 13 communities

Abundance explained by isolate sequences, allowing up to two mismatches. C2R6 and C10R2 are each missing the Sanger sequence of one isolate.

Abundance explained by isolate sequences. Rows show how many bp mismatches are allowed in ESV-Sanger matches, whereas columns are the different alignment algorithms used by Biostrings::pairwiseAlignment().
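A minimal sketch of what "abundance explained" could mean in practice. It assumes the quantity is the summed relative abundance of the ESVs that are matched by at least one isolate in each community; the `MatchedByIsolate` column and all values are hypothetical.

```r
# Toy ESV table flagged by whether an isolate matched each ESV (hypothetical).
matched_toy <- data.frame(
  Community         = c("C1R1", "C1R1", "C1R1", "C2R4", "C2R4"),
  CommunityESVID    = c("esv1", "esv2", "esv3", "esv4", "esv5"),
  RelativeAbundance = c(0.6, 0.3, 0.1, 0.7, 0.3),
  MatchedByIsolate  = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors  = FALSE
)

# Sum the relative abundance of matched ESVs within each community.
explained_toy <- aggregate(
  RelativeAbundance ~ Community,
  data = matched_toy[matched_toy$MatchedByIsolate, ],
  FUN = sum
)
names(explained_toy)[names(explained_toy) == "RelativeAbundance"] <- "ExplainedAbundance"

print(explained_toy)
```

A community whose isolates cover all abundant ESVs would approach an explained abundance of one; a missing isolate sequence lowers it by the relative abundance of the ESV that isolate would have matched.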
